RSS Pre-Conference Workshop 2023
Visualisation is a key skill in your data science tool kit:
Reflective exercise, not a tutorial or rulebook.
Coffee consumption, visualised. Jaime Serra Palou.
Caffeination vs sleep, shown in Lego. Elsie Lee-Robbins.
{ggplot2}: Layered creation of graphics from tidy data.
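A minimal sketch of what "layered" means in practice, not taken from the workshop materials. It assumes the `mpg` dataset that ships with {ggplot2}; the choice of variables and labels is purely illustrative.

```r
library(ggplot2)

# `mpg` ships with {ggplot2} and is already tidy:
# one row per car model, one column per variable.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +    # data and aesthetic mappings
  geom_point(aes(colour = drv), alpha = 0.7) +  # layer 1: one point per car
  geom_smooth(                                  # layer 2: straight-line trend
    method = "lm", formula = y ~ x,
    se = FALSE, colour = "grey30"
  ) +
  labs(
    x = "Engine displacement (litres)",
    y = "Highway fuel economy (miles per gallon)",
    colour = "Drive train"
  )

p  # printing the object draws the plot
```

Each `+` adds one component to the plot object, so layers can be added, removed or reordered without touching the data.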
Learning {ggplot2}:
Use cases: exploratory analysis, presentation, report / paper, data journalism.
Considerations:
Who is the intended audience for your visualisation?
What knowledge do they bring with them?
What assumptions and biases do they hold?
Creating personas for distinct user groups can be helpful.
Issues with scales, area and perspective
Captions
Describes a figure or table so that it may be identified in a list of figures and (where appropriate).
Alternative text
Describes the content of an image for a person who cannot view it. (Guide to writing alt-text)
Titles
Give additional context or identify key findings. Active titles are preferable.
A passive title to avoid: "Graph to show how X varies with Y".
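One possible way to attach an active title and alt-text to a plot in {ggplot2}, sketched with the built-in `mpg` data. The wording of the title and alt-text is an illustrative assumption. Note that the `caption` argument of `labs()` is a note printed under the plot, which is not the same thing as a document-level figure caption.

```r
library(ggplot2)

# Alt-text: chart type, what is plotted, and the key pattern.
alt_text <- paste(
  "Scatter plot of highway fuel economy in miles per gallon against",
  "engine displacement in litres for", nrow(mpg), "cars.",
  "Fuel economy tends to fall as engine size increases."
)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  labs(
    # Active title: states the finding rather than naming the axes.
    title = "Cars with bigger engines travel fewer miles per gallon",
    caption = "Data: the mpg dataset shipped with {ggplot2}.",
    alt = alt_text,  # stored with the plot (ggplot2 >= 3.3.4)
    x = "Engine displacement (litres)",
    y = "Highway fuel economy (miles per gallon)"
  )

get_alt_text(p)  # retrieves the stored alt-text
```

Storing the alt-text with the plot object keeps the description next to the code that produces the figure, so the two are less likely to drift apart.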
Markdown: great for multi-output documents, but it comes in many flavours.
Github / Jupyter:
Quarto:
When using literate programming alt-text can be added as code block meta-data.
In Quarto:
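A hedged sketch of what such a code block might look like. The lines below are the body of an `{r}` chunk in a `.qmd` file: the `#|` comments are Quarto chunk options and must sit at the top of the chunk. The label, caption and alt-text are illustrative, and the built-in `faithful` data stands in for your own.

```r
#| label: fig-old-faithful
#| fig-cap: "Eruption duration and waiting time for the Old Faithful geyser."
#| fig-alt: "Scatter plot of waiting time to the next eruption against eruption duration, both in minutes. The points form two clusters: short eruptions followed by short waits, and long eruptions followed by long waits."

library(ggplot2)

p <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point() +
  labs(
    x = "Eruption duration (minutes)",
    y = "Waiting time to next eruption (minutes)"
  )

p  # the printed plot is what Quarto captures as the figure
```

Outside Quarto the `#|` lines are ordinary R comments, so the same code also runs unchanged in a plain R session.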
Where does your purpose fall on this triangle?
No such thing as neutral presentation.
Start with a hook.
Decisions cost time, energy and money. (DRY)
Consider your design choices carefully and write down your decisions and reasoning. (DRY)
This will form the basis of your own style-guide for data visualisation.
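One way to record design decisions in code is a reusable theme function, so each choice is made once and applied everywhere. The sketch below is an assumption about how that could look: `theme_house()` is a hypothetical name, and the specific choices inside it are placeholders for your own.

```r
library(ggplot2)

# Hypothetical house style: each setting records one design decision.
theme_house <- function(base_size = 14) {
  theme_minimal(base_size = base_size) +
    theme(
      plot.title.position = "plot",         # align titles with the plot edge
      panel.grid.minor = element_blank(),   # fewer grid lines, less clutter
      legend.position = "bottom"            # keep full width for the data
    )
}

# Reuse the same decisions on any plot.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  theme_house()

p
```

Writing the reasoning next to each setting, as in the comments above, means the style guide and the code that implements it stay in one place.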
The Pudding (learning resources)
Quarto, R and {ggplot2}.
Blog post: women in politics.
General audience, familiar with UK politics.
Representation of women in parliament is improving over time.
Style guidelines of blog and political parties.
Think about your tools
Think about your medium
Think about your audience
Think about your story
Think about your guidelines
The Climate Book - Penguin
Coffee Cup - Jaime Serra Palou
Lego coffee - Elsie Lee-Robbins via Twitter
Pre-attentive attributes - Adapted from Better Data Visualizations
Male Heights - patient.info
Desaturated colour scales - {viridis} documentation
RSS Best Practices for Data Visualisation
How to make data outputs more readable, accessible, and impactful.
11:40-13:00 Tuesday, 5 September, 2023, Auditorium
Split into 5 groups and spread around the room.
Try to draw the plot based on the alt-text provided.
Time allowed: 8 minutes.
Now for the inverse problem:
Write your own alt-text based on the plot provided.
Time allowed: 8 minutes.
Put your alt-text to the test:
Pass your alt-text to another group. They have to try and recreate your plot!
Time allowed: 8 minutes.
R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
locale: en_US.UTF-8||en_US.UTF-8||en_US.UTF-8||C||en_US.UTF-8||en_US.UTF-8
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: dplyr(v.1.1.2), ggplot2(v.3.4.0) and datasauRus(v.0.1.6)
loaded via a namespace (and not attached): Rcpp(v.1.0.9), cellranger(v.1.1.0), compiler(v.4.2.2), pillar(v.1.9.0), sysfonts(v.0.8.8), tools(v.4.2.2), digest(v.0.6.31), jsonlite(v.1.8.4), evaluate(v.0.20), lifecycle(v.1.0.3), tibble(v.3.2.1), gtable(v.0.3.1), pkgconfig(v.2.0.3), png(v.0.1-7), rlang(v.1.1.0), cli(v.3.6.0), rstudioapi(v.0.14), yaml(v.2.3.6), countdown(v.0.4.0), xfun(v.0.36), fastmap(v.1.1.0), showtextdb(v.3.0), withr(v.2.5.0), stringr(v.1.5.0), knitr(v.1.41), generics(v.0.1.3), vctrs(v.0.6.2), grid(v.4.2.2), tidyselect(v.1.2.0), glue(v.1.6.2), R6(v.2.5.1), fansi(v.1.0.3), readxl(v.1.4.3), rmarkdown(v.2.19), pander(v.0.6.5), whisker(v.0.4.1), farver(v.2.1.1), purrr(v.1.0.1), tidyr(v.1.2.1), magrittr(v.2.0.3), prismatic(v.1.1.1), ellipsis(v.0.3.2), scales(v.1.2.1), htmltools(v.0.5.4), showtext(v.0.9-5), zvplot(v.0.0.0.9000), colorspace(v.2.1-0), labeling(v.0.4.2), utf8(v.1.2.2), stringi(v.1.7.12) and munsell(v.0.5.0)
Go Figure! RSS Pre-Conference Workshop 2023 - Zak Varty